Identification and Molecular Characterization of Novel Mycoviruses in Saccharomyces and Non-Saccharomyces Yeasts of Oenological Interest

Wine yeasts can be natural hosts for dsRNA, ssRNA viruses and retrotransposon elements. In this study, high-throughput RNA sequencing combined with bioinformatic analyses unveiled the virome associated to 16 Saccharomyces cerevisiae and 8 non-Saccharomyces strains of oenological interest. Results showed the presence of six viruses and two satellite dsRNAs from four different families, two of which—Partitiviridae and Mitoviridae—were not reported before in yeasts, as well as two ORFan contigs of viral origin. According to phylogenetic analysis, four new putative mycoviruses distributed in Totivirus, Cryspovirus, and Mitovirus genera were identified. The majority of commercial S. cerevisiae strains were confirmed to be the host for helper L-A type totiviruses and satellite M dsRNAs associated with the killer phenotype, both in single and mixed infections with L-BC totiviruses, and two viral sequences belonging to a new cryspovirus putative species discovered here for the first time. Moreover, single infection by a narnavirus 20S-related sequence was also found in one S. cerevisiae strain. Considering the non-Saccharomyces yeasts, Starmerella bacillaris hosted four RNAs of viral origin—two clustering in Totivirus and Mitovirus genera, and two ORFans with putative satellite behavior. This study confirmed the infection of wine yeasts by viruses associated with useful technological characteristics and demonstrated the presence of complex mixed infections with unpredictable biological effects.


Introduction
A wide diversity of mycoviruses has been reported in the major fungal taxa since the description of the first species in 1962 [1][2][3]. Fungal viruses lack an extracellular phase as free virions, and are transmitted intracellularly during cell division, cell fusion, and sporogenesis [1,3] with noticeable exceptions within the family Genomoviridae [4,5]. Mycovirus infections are usually cryptic or associated with mild symptoms, but in some cases, they can induce growth alteration, reduce the virulence, or kill other individuals of their host species [1,6,7].
The over 500 fungal viruses reported in the literature are characterized by simple or multi-segmented genomes which are packaged in isometric, spherical, bacilliform, or filamentous particles [8][9][10]. The majority of fungal viruses have double-stranded RNA Double-strand RNA elements found in non-Saccharomyces yeast of oenological interest include only members of the family Totiviridae involved in the killer phenotype of the host [43]. L-A and M dsRNA types were characterized in Hanseniaspora uvarum and Zygosaccharomyces bailii [44] and Torulaspora delbrueckii [45].
Budding yeast is also a model eukaryotic host for the virus infection of plant and animal viruses [46], providing numerous insights in the genetics of the host factors involved in virus replication [47]. In this study, we applied RNAseq analysis to verify the presence of mycoviruses in both Saccharomyces and non-Saccharomyces isolates of oenological interest. Our aim was to lay the bases for future studies assessing the effect of the virus/yeast interaction with possible new virus types found through NGS approaches, as well as to identify viruses in new yeast isolates with technological characteristics useful for the production of high-quality wines.

Yeast Isolate Maintenance
Twenty-four yeast strains of oenological interest were investigated in this study (Table 1). Among them, 16 were S. cerevisiae strains used as commercial starters in oenology that were collected from packages of active dry yeasts purchased on the market. The remaining eight non-Saccharomyces species came from the collections of the Regional Insti-

Total RNA Extraction and RNAseq
Total RNA was extracted from yeast pellets using the E.Z.N.A. Fungal RNA Mini Kit (Omega Bio-tek, Norcross, GA, USA) and resuspended in 30 µL of RNase-free water. Nucleic acid extracts were analyzed in a NanoDrop 2000 Spectrophotometer (Thermoscientific, Waltham, MA, USA) to evaluate their concentration and estimate their purity, and then stored at −80 • C. A total of 24 RNA samples were pooled by mixing 1 µg of RNA from each fungal sample to sequence more than one isolate per library. Eight µg total RNA were sent to sequencing facilities of Macrogen (Seoul, Korea) who performed the ribosomal RNA (rRNA) depletion (Ribo-ZeroTM Gold Kit, Epicentre, Madison, WI, USA), the cDNA library building (TrueSeq total RNA sample kit, Illumina, San Diego, CA, USA), and sequencing by an Illumina HiSeq4000 system generating paired-end sequences.

ORFan Contig Detection
The assembled contigs were blasted, using DIAMOND (0.9.26), against the NCBI non-redundant whole database (version-October 2019). All hits were discarded, while the unmatched contigs with nucleic size over 1 kb and encoding a protein of at least 150 amino acids (~15 kDa) were kept. The latter were used to map reads considering their orientation, i.e., whether they mapped in sense or antisense orientation. Contigs with only positive or negative mapped reads were discarded, while the remaining contigs formed the ORFans pool [11].

Virus Detection in Specific Isolates
The distribution of the in silico-assembled viruses was verified for each of the 24 yeast isolates by reverse transcription followed by real-time PCR. To this aim, total RNA extracts were used as a template for cDNA synthesis using the High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher scientific, Waltham, MA, USA) following manufacturer's instructions. Real-time PCR reactions were carried out in a CFX Connect real-time PCR Detection System (Bio-Rad Laboratories, Hercules, CA, USA), adding 2 µL of 1:10 cDNA dilutions (in sterile distilled water, SDW) to 20 µL real-time PCR mix containing iTaq Universal SYBR Green Supermix, 200 nM (each) primers and SDW. A melting curve analysis was performed to check for unspecific real-time PCR products. In order to confirm the sequence of the assembled contigs, fragments of viral genomes were amplified by PCR reactions on the cDNA using specific primers designed in this study. All primers used in this work are reported in the Supplementary Table S1. Purified PCR products were selected and sent to the Eurofins Genomics sequencing service (Eurofins Genomics, Ebersberg, Bayern, Germany). All sequences determined in this work were submitted to the NBCI, and accession numbers were assigned (Table 2).

ORFan Detection in Specific Isolates and DNA Integration Assay
In silico detected ORFan contigs were searched in the twenty-four isolates studied through real-time PCR using the same procedure described for viral detection. Real-time PCR primers targeting the specific ORFan contigs are reported in Supplementary Table  S1. For each yeast isolate, real-time PCR reactions were performed on cDNA and on total nucleic acids to verify if DNA corresponding to the ORFan coding sequence is present, therefore confirming (or excluding) the exclusive RNA nature of the ORFan, thus excluding (or confirming) an integration in the host genome or a DNA intermediate during replication.

ORF Prediction and Phylogenetic Analyses
ORF predictions were performed using the ORF finder tool from NCBI, and predictions were made selecting the "standard" genetic code for all viral contigs, except the one closely related to mitoviruses, generally hosted in the mitochondria. The putative function of the predicted protein was established by BLASTp analysis, looking at the function of the closest proteins in the NCBI database. The predicted protein sequences were analyzed through a BLASTp search using the domain finder option to evaluate the presence of any conserved domain in the sequence (such as the viral polymerase GDD conserved domain).
To assess the genomic variability of the totivirus isolates associated to the killer phenotype, ScV-L-A and ScV-L-BC CPs as well as of the ScV-M dsRNAs, nucleotide sequences were trimmed, aligned by CLUSTALW implemented in the MEGA X software [55] and compared to sequences of isolates retrieved from the GenBank (Supplementary Table S2). Phylogenetic trees were obtained using the Maximum likelihood method with the Kimura 2-parameter model and 1000 bootstrap replicates.
For the remaining phylogenetic analyses, multiple sequence alignments of viral RdRp amino acid sequences were performed with CLUSTAL Omega, and gaps were removed using trimAl v1.3 (strict mode) [56]. Maximum likelihood phylogenetic trees were constructed using the RtREV with Freqs. (+F) model and the Gamma Distributed with Invariant Sites (G+I) implemented in the MEGA X software. Bootstrap values are from 1000 replicates. Percentage identity values were obtained using the p-distance method of MEGA X, which automatically calculates the distances for nucleotide and amino acid sequences.

Results
A single sequencing run of the 24 pooled yeast isolates produced 135,345,688 total reads deposited in Sequence Read Archive (SRA) on the NCBI server (Bioproject PRJNA768663). Following the Trinity run, a total of 25,321 contigs were obtained. A BLAST search of a custom prepared viral database identified 13 putative viral contigs. Among those, seven contigs contained almost complete viral genomes; three contigs encoded viral RdRp, one a CP sequence, and two contigs viral toxins. In all, our bioinformatics pipeline unveiled the presence of six distinct mycoviruses and two satellite dsRNAs included in Mitoviridae, Narnaviridae, Partitiviridae, and Totiviridae families. Reverse transcription followed by realtime PCR assays confirmed the presence of viral infections in 15 out 24 tested isolates, with 13/15 mixed infections (Table 3).

Totivirus Related Sequences
From the RNAseq assembly, we identified two dsRNA viral species, ScV-L-A and ScV-L-BC, included in the genus Totivirus (family Totiviridae) and one satellite dsRNA, ScV-M ( Table 3).
The presence of ScV-L-A was revealed by a contig of 3456 nucleotides showing 99% identity with the reference L-A-2 strain 8F-13 (GenBank KC677754), encoding the putative viral RdRp (partial) and the CP (complete) ORFs. ScV-L-A infection was confirmed in 12 yeast isolates, and the presence of different genetic variants, two L-A-lus and ten L-A-2, was verified by sequence alignment and the phylogenetic analysis of a 647 bp-long fragment of the CP gene (Supplementary Figure S1). According to pairwise analysis, ScV-L-A CP sequences shared among them from 77.6% to 100% and from 95.3% to 100% of nucleotide and amino acid identity, respectively (Supplementary Table S3). Further analysis of the CP partial sequence revealed that the two isolates LA-lus-D254 and LA-lus-U43 shared 85.3% and 97.2% of nucleotide and amino acid identity between them, and values higher than 84.9% and 96.7% of nucleotide and amino acid identity with the reference L-A-lus sequence (GenBank JN819511), respectively. ScV-L-A isolates clustering in the L-A-2 phylogenetic group shared nucleotide and amino acid identities >99% among them as well as with the L-A-2 reference strain 8F-13 (Supplementary Table S3). In particular, the isolates LA2-L2226 showed 100% amino acid identity with LA2-2323, LA2-VL1, and LA2-FX10, as well as with the reference L-A-2 sequence. The isolates LA2-EC1118, LA2-QA23, LA2-X5, LA2-VR5, LA2-PDM, and LA2-EZ44 shared 100% amino acid identity among them, and 99.5% with the L-A-2 strain 8F-13.
Two contigs of 1165 and 308 bp corresponded to ScV-M-2 and ScV-M-lus satellite dsRNAs encoding the complete K-2 and the partial (73 aa) K-lus killer toxins, respectively. ScV-M dsRNAs were detected in 12 out 24 S. cerevisiae strains; among them, 11 strains carried the M-2 type and one strain the M-lus type (Table 3).
Following pairwise analysis, ScV-M-2 sequences obtained in this work shared among them from 98.6% to 100% and from 97.5% to 100% of nucleotide and amino acid identity, respectively (Supplementary Table S4). A Further pairwise comparison including ScV-M-2 sequences from the GenBank showed nucleotide and amino acid identity values ranging from 98.2-100% and 71.8-100%, respectively (Supplementary Table S4). Considering the overall structure of K2 preprotoxins, the presence of a conserved hydrophobic amino terminal portion (aa 27-45), three potential N-glycosylation sites (aa 177-180, 214-217, 261-264), and two cleavage sites (KR) recognized by the KEX1 (aa 220-221) and KEX2 (aa 267-268) proteases, was verified in all isolates except M2-FX10. In this latter case, we found a substitution of lysine with glutamic acid at position 220 at the first KR site. Moreover, we found additional amino acid substitutions in eight different positions of the K2 preprotoxins from different isolates as reported in Supplementary Figure S2.
The comparison of a 292 bp fragment located at the 5 of M-lus RNA from Lalvin ICV D254 showed 97% nucleotide identity with the corresponding region of the reference strain Mlus-4 (GenBank GU723494). Amino acid sequence analysis predicted a defective ORF due to the presence of a stop codon at position 232-234 that was confirmed not only in the in-silico assembly, but also through Sanger sequencing of a PCR fragment obtained by RT-PCR.
Infection with ScV-L-BC was revealed by four contigs, one of 4211 bp and three of 4601 bp, sharing from 94.4% to 99.5% nucleotide identity with the corresponding genomic regions of the reference strain ScV-L-BC-2 (GenBank KX906605). All four contigs contained complete ORFs encoding RdRp and CP proteins. Eight out of the 24 tested yeasts were positive for ScV-L-BC infection. According to the sequence analysis of an 801 bp-long fragment covering partial RdRp and CP genes, two commercial S. cerevisiae strains carried the ScV-L-BC-La type and six strains the ScV-L-BC-2 type (Table 3). Considering pairwise distances, ScV-L-BC sequences shared among them from 95.5% to 100% and from 97.4% to 100% of nucleotide and amino acid identity, respectively (Supplementary Table S5 Table S5). The highest identity values were obtained among ScV-L-BC-2 type variants, sharing from 99.8% to 100% nucleotide and 100% amino acid identity, both among them and with the reference strain ScV-L-BC-2 (GenBank KX906605) (Supplementary Table S5).
A contig of 5878 bp corresponding to a previously unknown virus within the family Totiviridae was identified by our bioinformatic pipeline and detected by real-time PCR on the cDNA obtained from S. bacillaris Cz12 total RNA (Table 3). A similarity search by BLASTp with the predicted amino acid sequence as query showed the maximum identity score with Red algae totivirus 1 Chiba7 (RaTV1, GenBank LC521327) from the red algae holobiont [55]; according to this result, the sequence was named Starmerella bacillaris totivirus 1 (SbTV1). A 5654 bp-long fragment of the SbTV1 genome was verified by Sanger sequencing of overlapping RT-PCR amplified segments with specific primers designed in this study (Supplementary Table S1). The sequence contained two distinct ORFs of 2649 bp (ORF2) and 2829 bp (ORF1) encoding the complete viral RdRp (882 aa) and CP (942 aa), respectively. SbTV1 showed the maximum CP amino acid identity value (40.3%) with Magnaporthe oryzae virus 2 (MoV2, GenBank BCL64969), while the RdRp highest identity value (43.9%) was obtained with RaTV1 (GenBank BBZ90082), respectively (Supplementary Tables S6 and S7). An overlap region (AUGA; nt 3166 to 3169) for the ribosomal slippage was found between the CP and RdRp ORFs. A phylogenetic analysis of complete RdRp amino acid sequences grouped SbTV1 and RaTV1 in a separated cluster within the Totiviridae family, confirming their close relationship (Figure 1).

Partitivirus Related Sequences
Two contigs of 1689 bp and 1321 bp corresponding to RdRp (RNA1) and CP (RNA2) of an unknown virus within the family Partitiviridae were identified among the viral proteins database. A BLASTp search revealed the highest amino acid identity with Cryptosporidium parvum virus 1 (CSpV1, GenBank AAC47805). Reverse transcription followed by real-time PCR showed that two S. cerevisiae strains, ICV D254 and UVAFERM43, hosted both the RNA1 and the RNA2 of the putative partitivirus ( Table 3). Fragments of 1552 bp and 1047 bp encoding the complete RdRp and CP ORFs were sequenced for both isolates, using specific primers designed in this study (Supplementary Table S1). Among them, D254 and U43 sequences showed 97.7% nucleotide and 99.4% amino acid identity values for the RdRp, and 98.9% nucleotide and 98.7% amino acid identity values for the CP, respectively. A pairwise comparison of D254 and U43 RdRp and CP complete sequences with all CSpV1 isolates from the GenBank showed the highest identity values (37% and 21.8%) with Cryptosporidium parvum virus 1 (GenBank BAU19319) RdRp and Cryptosporidium meleagridis virus CP (GenBank ABB02501), respectively (Supplementary Tables S8 and  S9). A phylogenetic analysis of partial D254 and U43 RdRp (515 aa) sequences confirmed their position with members of the Cryspovirus cluster of Partitiviridae (Figure 2). According to these results, the two sequences were proposed as members of a new putative species provisionally named Saccharomyces cerevisiae cryspovirus 1 (ScCV1).

Narnavirus Related Sequences
The presence of a narnavirus in the pooled RNA was revealed by a contig of 2495 nt sharing 94.75% nucleotide identity with the S. cerevisiae narnavirus 20S RNA W (ScNV-20S, GenBank AF039063). The diagnostic screening showed that only S. cerevisiae NDA21 hosted this narnavirus. A comparison of the NV-NDA21 complete RdRp sequence amplified with specific primers showed amino acid identities values of 98.6% and 98.7% with the two ScNV 20S sequences available in the GenBank (AAA47824 and NP_660178), 55.9% with ScNV-I329 (GenBank QFU28543), and 30% with ScNV 23S (GenBank NP_660177) (Supplementary Table S10). A maximum likelihood phylogenetic analysis of RdRp (829 aa) confirmed its position in the ScNV-20S clade (Figure 3).
Magnaporthe oryzae virus 2 (MoV2, GenBank BCL64969), while the RdRp highest identity value (43.9%) was obtained with RaTV1 (GenBank BBZ90082), respectively (Supplementary Tables S6 and S7). An overlap region (AUGA; nt 3166 to 3169) for the ribosomal slippage was found between the CP and RdRp ORFs. A phylogenetic analysis of complete RdRp amino acid sequences grouped SbTV1 and RaTV1 in a separated cluster within the Totiviridae family, confirming their close relationship (Figure 1).

Partitivirus Related Sequences
Two contigs of 1689 bp and 1321 bp corresponding to RdRp (RNA1) and CP (RNA2) of an unknown virus within the family Partitiviridae were identified among the viral proteins database. A BLASTp search revealed the highest amino acid identity with

Mitovirus Related Sequences
From the RNAseq assembly, we identified a 2465 nt-long contig encoding for one ORF producing the complete viral RdRp (714 aa) of a putative new mitovirus. The predicted protein alignment and phylogenetic analysis confirmed this sequence as a member of the genus Mitovirus, showing the highest amino acid sequence identity (35.97%) with the RdRp of Sclerotinia sclerotiorum mitovirus 26 (GenBank AWY10984). Reverse transcription followed by real-time PCR attributed this sequence to the S. bacillaris Cz12 isolate (Table 3). A 2297 bp-long fragment containing the complete RdRp ORF was verified following amplification and sequencing with specific primers designed on the contig (Supplementary Table  S1). A BLASTp search showed that the RdRp amino acid sequence of the Cz12 mitovirus is most closely related to those of Sclerotinia sclerotiorum mitovirus 26 (SsMV26; GenBank AWY10984.1), Entomophthora muscae mitovirus 6 (EmMV6; GenBank QCF24450.1), and Erysiphe necator associated mitovirus 29 (EnMV29; GenBank QKI79977.1), with aa identities of 35.97%, 35.74%, and 34.56%, respectively. Sequence analysis evidenced the conserved domain Mitovir RNA pol at position 184-468 and confirmed the presence of the hallmark viral polymerase GDD tripeptide conserved domain, as well as of the six conserved motifs (I-VI) typical of the RdRps of mitochondrial viruses (Figure 4). The Maximum likelihood phylogenetic tree based on the RdRp amino acid sequence (714 aa) confirmed Cz12 within the clade III of mitoviruses [57] (Figure 3). According to these results, the provisional name Starmerella bacillaris mitovirus 1 (SbMV1) was given to this virus.
amino acid identity values for the RdRp, and 98.9% nucleotide and 98.7% amino acid identity values for the CP, respectively. A pairwise comparison of D254 and U43 RdRp and CP complete sequences with all CSpV1 isolates from the GenBank showed the highest identity values (37% and 21.8%) with Cryptosporidium parvum virus 1 (GenBank BAU19319) RdRp and Cryptosporidium meleagridis virus CP (GenBank ABB02501), respectively (Supplementary Tables S8 and S9). A phylogenetic analysis of partial D254 and U43 RdRp (515 aa) sequences confirmed their position with members of the Cryspovirus cluster of Partitiviridae (Figure 2). According to these results, the two sequences were proposed as members of a new putative species provisionally named Saccharomyces cerevisiae cryspovirus 1 (ScCV1).

ORFan Sequences
Our in-silico approach allowed the identification of eighteen contigs encoding for proteins without homology to known peptides deposited on available databases (ORFans). From these, the proportion of reads mapping on the positive or negative sense of each contig was screened manually, and only the contigs showing at least around 1/10 of the negative/positive reads proportion were kept for further analysis. The three resulting contigs that passed the last step were called ORFan1 to ORFan3, and information about their nucleotide sequences (sequence length, amino-acid length of the putative protein encoded, and the corresponding translated ORF) is displayed in Supplementary online Text S1. To detect ORFan1, 2, and 3 presence in the isolates from the collection and to confirm their viral nature as the RNA sequence not integrated in the host genome, real-time PCR amplifications on both total nucleic acids and the corresponding cDNAs were performed. Results obtained allowed for the identification of ORFan 2 and ORFan 3 in Starmerella bacillaris strain Cz12, confirming the absence of a specific signal when performing the PCR reaction on the total nucleic acid, therefore confirming that these segments do not have a corresponding DNA, and derive by RNA-dependent replication. However, ORFan1 was detected in Candida stellata strain DBVPG 6714, but the signal related to this contig was observed also in the total nucleic acid control, suggesting the genomic origin of ORFan1.
Real time PCR products obtained using as a template the cDNA from yeast total RNA were separated through agarose gel electrophoresis (Figure 5a). Taken together, these results show that we were able to identify two ORFan sequences of putative viral origin (ORFan2 and ORFan3) that are found in the host isolate only as the RNA molecule and one sequence (ORFan1) that is found both as RNA and DNA. ORF distribution within the three ORFans is reported in Figure 5b.

Narnavirus Related Sequences
The presence of a narnavirus in the pooled RNA was revealed by a contig of 2495 nt sharing 94.75% nucleotide identity with the S. cerevisiae narnavirus 20S RNA W (ScNV-20S, GenBank AF039063). The diagnostic screening showed that only S. cerevisiae NDA21 hosted this narnavirus. A comparison of the NV-NDA21 complete RdRp sequence amplified with specific primers showed amino acid identities values of 98.6% and 98.7% with the two ScNV 20S sequences available in the GenBank (AAA47824 and NP_660178), 55.9% with ScNV-I329 (GenBank QFU28543), and 30% with ScNV 23S (GenBank NP_660177) (Supplementary Table S10). A maximum likelihood phylogenetic analysis of RdRp (829 aa) confirmed its position in the ScNV-20S clade (Figure 3). analysis evidenced the conserved domain Mitovir RNA pol at position 184-468 and confirmed the presence of the hallmark viral polymerase GDD tripeptide conserved domain, as well as of the six conserved motifs (I-VI) typical of the RdRps of mitochondrial viruses (Figure 4). The Maximum likelihood phylogenetic tree based on the RdRp amino acid sequence (714 aa) confirmed Cz12 within the clade III of mitoviruses [57] (Figure 3). According to these results, the provisional name Starmerella bacillaris mitovirus 1 (SbMV1) was given to this virus.

ORFan Sequences
Our in-silico approach allowed the identification of eighteen contigs encoding for proteins without homology to known peptides deposited on available databases (ORFans). From these, the proportion of reads mapping on the positive or negative sense of each contig was screened manually, and only the contigs showing at least around 1/10 of the negative/positive reads proportion were kept for further analysis. The three resulting contigs that passed the last step were called ORFan1 to ORFan3, and information about their nucleotide sequences (sequence length, amino-acid length of the putative protein encoded, and the corresponding translated ORF) is displayed in Supplementary online Text S1. To detect ORFan1, 2, and 3 presence in the isolates from the collection and to confirm their viral nature as the RNA sequence not integrated in the host genome, realtime PCR amplifications on both total nucleic acids and the corresponding cDNAs were performed. Results obtained allowed for the identification of ORFan 2 and ORFan 3 in Starmerella bacillaris strain Cz12, confirming the absence of a specific signal when performing the PCR reaction on the total nucleic acid, therefore confirming that these segments do not have a corresponding DNA, and derive by RNA-dependent replication. However, ORFan1 was detected in Candida stellata strain DBVPG 6714, but the signal related to this contig was observed also in the total nucleic acid control, suggesting the genomic origin of ORFan1. Real time PCR products obtained using as a template the cDNA from yeast total RNA were separated through agarose gel electrophoresis (Figure 5a). Taken together, these results show that we were able to identify two ORFan sequences of putative viral origin (ORFan2 and ORFan3) that are found in the host isolate only as the RNA molecule and one sequence (ORFan1) that is found both as RNA and DNA. ORF distribution within the three ORFans is reported in Figure 5b.

Discussion
Some yeast species more than others are used in the fermentation of musts for their oenological qualities; nevertheless, the wine industry is constantly looking for new superior strains able to improve the quality of final products [58,59]. In the last fifteen years, many scientific works demonstrated the possibility of selecting non-Saccharomyces yeasts capable of improving the quality of wines produced by controlled mixed fermentation, where the inoculation is carried out with two different strains, a non-Saccharomyces and a Saccharomyces, usually sequentially [60]. The interest for the use of non-Saccharomyces yeasts in oenology is increasing, because they are able to decrease the alcohol content of wines, to increase the final concentration of glycerol and aromatic compounds, to produce

Discussion
Some yeast species more than others are used in the fermentation of musts for their oenological qualities; nevertheless, the wine industry is constantly looking for new superior strains able to improve the quality of final products [58,59]. In the last fifteen years, many scientific works demonstrated the possibility of selecting non-Saccharomyces yeasts capable of improving the quality of wines produced by controlled mixed fermentation, where the inoculation is carried out with two different strains, a non-Saccharomyces and a Saccharomyces, usually sequentially [60]. The interest for the use of non-Saccharomyces yeasts in oenology is increasing, because they are able to decrease the alcohol content of wines, to increase the final concentration of glycerol and aromatic compounds, to produce proteolytic and pectinolytic activities as well as to influence the concentration of polysaccharides [58,61].
Yeasts are popular model organisms for virus research because they can be hosts for plant, animal, and human viruses [28]. Yeast-virus interactions underlie important mechanisms such as the killer system with several potential applications, including the production of food and beverages, biological control of plant pathogens, biocontrol of postharvest fungal alterations [62,63]. It is believed that the loss of anti-viral silencing defense genes in some budding yeast species is linked to the fitness gain obtained by hosting viruses supporting killer toxin replication [64]. Despite their biotechnological potential, the impact of yeast natural infections has been deeply investigated only for the killer system in species of the Saccharomycetaceae family and in few species of non-Saccharomyces yeasts employed for winemaking [43,65,66]. In this work, we searched for new viruses possibly able to affect the technological characteristics of both non-Saccharomyces and Saccharomyces strains using a high-throughput RNAseq approach combined to reverse transcription and real-time PCR, which were confirmed here to be suitable methods to unravel the presence of new viral species even in unicellular fungi [29]. The main advantage of using real-time PCR for amplification of reverse transcription products relies on the possibility of collecting data during the reaction exponential phase, when the quantity of the PCR product is directly proportional to the amount of template viral cDNA, thus providing information on the virus titer. Furthermore, a combined melting analysis allows evaluating accurately the reaction specificity [67].
With the exception of the putative viral ORFan sequences, the viral species found in the strains analyzed belong to virus families already reported in yeast [27] or in other unicellular organisms [68,69]. However, this is the first record of a virus belonging to the family Paritiviridae in S. cerevisiae, as well as the first report of infection by viruses of Totiviridae and Mitoviridae families in S. bacillaris. Despite the high infection rate among the Saccharomyces isolates, the number of viral species found is low, confirming the data reported in the literature for wine yeasts [29]. Particularly, the average number of viral sequences per infected isolate is low compared to what is usually reported in filamentous fungi often subjected to complex infections with viruses belonging to different families [8,11,13,23]. In this work, we identified multiple viral infections in S. bacillaris and in all S. cerevisiae strains with the exception of S. cerevisiae NDA21 and Lalvin RC212, infected only by a narnavirus and a totivirus, respectively. Our results highlight the presence of natural mixed infections of viruses belonging to different families in both Saccharomyces and non-Saccharomyces yeasts. Interestingly, no toti-narnavirus combinations were found in this study nor are they reported in the previous literature, although these families are the most widespread and were widely studied in both commercial and wild yeasts of oenological interest in the last fifty years. To investigate this aspect, it will be probably necessary to test a greater number of natural yeast strains not subjected to industrial selection using a high-throughput sequencing approach.
The associations of L-A totiviruses with dsRNA M satellites reported in the literature for killer strains [30,70] were confirmed in this work in all K and N commercial S. cerevisiae strains, while they were not found in the S strains. In particular, all the L-A2 type maintained M2 satellites while L-A-lus maintained either Mlus or M2, as reported in wild-type strains [71]. Moreover, we did not find a specific correspondence of L-BC genetic variants with L-A genotypes, confirming the data reported in the literature [30]. Considering the intra-specific nucleotide diversity for both totivirus species, the highest values calculated for L-A (22%) and for L-BC (4.5%) (Supplementary Tables S3 and S5) fall within the range reported in the literature [30]; moreover, these ScV-L-A and ScV-L-BC isolates clustered in phylogenetic groups already described ( Supplementary Figures S1 and S3), confirming the distribution of different genetic types.
Toxin production was predicted in six killer (K) and six neutral (N) Saccharomyces commercial strains ( Table 2). The variability highlighted among the M2 dsRNAs from both K and N yeasts resulted in eight amino acid substitutions distributed along the whole sequence that cannot be directly associated with possible alterations of the toxin efficacy, with the exception of the N-strain Zymaflore FX10. For this strain, a lysine-glutamic acid substitution at position 220 could reduce the killing action of the K2 toxin, as demonstrated for other amino acid substitutions in the same KR cleavage site which is determinant for the conversion of the preprotoxin into the protoxin [72]. On the other hand, M-lus from the N strain Lalvin ICV D 254 showed a defective ORF that could compromise the synthesis of the functional K-lus toxin. Being neutral yeasts classified into two main categories, inactive toxin secretors and non-secretors [35], it is possible that the neutral phenotype of the remaining N commercial strains such as Uvaferm 43 and Zymaflore VL1 involves different mechanisms as those that alter the toxin secretion.
Despite the complexity of the family Narnaviridae which includes also plant and animal viruses, only three narnavirus species have been described in yeast [29,73]. Their replication mechanism was thoroughly studied by reverse genetics using an infectious clone of ScNV-20S [74]. As for all narna-like viruses, the ScNV-NDA21 monopartite genome encodes only for the RdRp protein, which is closely related to the reference ScNV-20S and NV-I329 isolates ( Figure 3). The intraspecific variability of Saccharomyces narnavirus species has not been yet fully investigated, few complete genomic sequences are available in GenBank; however, the NDA21 RdRp amino acid sequence showed a very strong correspondence with the reference isolates reported in the literature. For this reason, considering the amino acid identity value of 50% as the accepted threshold for the demarcation of species within the narnavirus genus [42], we believe that ScNV-NDA21 can be referred to as a new genetic variant of ScNV-20S.
Two undescribed mycoviruses, detected in two commercial S. cerevisiae strains, were closely related to members of the genus Cryspovirus of family Partitiviridae, which to date includes only distinctive protozoal dsRNA viruses isolated from Cryptosporidium spp. [69]. The family Partitiviridae currently comprises four other genera, Alphapartitivirus, Betapartitivirus, Gammapartitivirus, and Deltapartitivirus [69,75]. According to Vainio [69], genera within this family are identified based on their hosts (which is characteristic for each genus), genome and protein lengths, pairwise RdRp aa identity <24% for viruses from different genera, and separate phylogenetic groupings of RdRp aa sequences. The high RdRp and CP identity values between D254 and U43 isolates suggest the presence of a single viral species infecting S. cerevisiae. The high bootstrap value obtained from the RdRp phylogenetic analysis supports the inclusion of ScCV1-D254 and ScCV1-U43 in the genus Cryspovirus with Cryptosporidium parvum virus 1 (CSpV1) (Figure 2).
A pairwise comparison among the most closely related cryspoviruses showed that ScCV1-D254 and ScCV1-U43 had the highest identity percentages when aligned to CSpV1 (GenBank BAU19319) RdRp (37%) or to Cryptosporidium meleagridis virus CP (GenBank ABB02500) (21.8%) (Supplementary Tables S8 and S9). Both values are well below the threshold to establish a new species in the family Partitiviridae (aa identities of 80% for CP and 90% for RdRp) provided by ICTV guidelines [69]. Considering the distinct phylogenetic grouping of RdRp sequences, the protein lengths as well as their host specificity, we propose ScCV1-D254 and ScCV1-U43 as isolates of a new putative species belonging to the Cryspovirus genus, tentatively named Saccharomyces cerevisiae cryspovirus 1 (ScCV1). To our knowledge, this is the first report of a crispovirus in budding yeasts. A further analysis of the 5 -terminal and 3 -terminal genomic sequences will define their phylogenetic relationship within the family Partitiviridae.
Among the non-Saccharomyces strains, only Starmerella bacillaris Cz12 was infected by two putative new viral species belonging to the Totiviridae and the Mitoviridae families. S. bacillaris (synonym Candida zemplinina) is frequently isolated from grapes, musts, and wines in different parts of the world, and it is considered as one of the most promising species for enological use because of its ability to resist alcohol in advanced stages of winemaking [76].
The Cz12 strain was isolated in Sicily from grape musts [77] and selected for its ability in sequential fermentations with S. cerevisiae to produce wines with about 50% more glycerol than S. cerevisiae alone [78,79].
The first sequence identified in S. bacillaris showed a conserved ORF organization typical of totiviruses, including the eight conserved regions in the ORF2, and the ribosomal slippage site between the CP and RdRp ORFs [68,80]. Considering the criteria established for the demarcation of species included in the Totiviridae family (host specificity and amino acid identity values below 50%) [81], the virus identified here may represent a new species tentatively named SbTV1 to be included in a putative new genus within the family Totiviridae together with Red algae totivirus 1 (RaTV1). The second viral sequence found in S. bacillaris was a putative new mitovirus species showing the best correspondence with other fungal mitoviruses. The ICTV species demarcation criteria for mitoviruses indicate amino acid identity values of 40% as thresholds to discriminate among different species. In our case, identity values among SbMV1 and the tree reference species Sclerotinia sclerotiorum mitovirus 26 (SsMV26), Entomophthora muscae mitovirus 6 (EnmuMV6), and Entomophthora muscae mitovirus 29 (EnmuMV29), ranged from 28% to 32.2%, supporting the hypothesis of a new species as also shown in the Maximum likelihood phylogenetic tree ( Figure 3). To our knowledge, this is the first mitovirus found in yeasts.
Our bioinformatic pipeline to identify ORFans and in specific putative viral ORFans identified three segments encoding for proteins, but one of them turned out to be a DNA encoded ORFan, therefore unlikely to be of a current virus infection; nevertheless, the origin and function of this ORFan remains an interesting future task to be completed. Even more interesting are the two segments (ORFan2 and ORFan3) that are replicated via an RNA intermediate. Given that these two segments are in a yeast isolate containing a totivirus, it is tempting to speculate that they could be accessory satellite segments. Generally, satellites conserved terminal sequences with their helper virus, and so far, our inspection did not reveal any conservation of the termini (but we have only in-silico assembly that might miss the viral 5 and 3 ends). In the future, we will complete RACE for the ORFan segments and possibly derive in-vitro or in-vivo transcripts to test if they are infectious in yeast.
Mycoviruses commonly cause latent infections in fungi, and in some cases, they are able to alter remarkably the hosts biology without evident symptoms, such as for the killer system in totivirus-infected strains. In 2002, Lopez [73] demonstrated that the relative abundance of the narnavirus ScNV-20S and ScNV-23S RNA copy number increased in industrial and laboratory S. cerevisiae strains exposed to nutritional stress conditions, without any evident correlation between the viral infection and biological characteristics of the strains. Nevertheless, beyond the amount of literature regarding these systems, the results of the virus-host interaction are still poorly understood in yeasts, particularly in the case of new viral species and yeast species other than S. cerevisiae with a high oenological potential. As reported in studies on unicellular organisms, viral infections can influence the virulence of their hosts [82,83]. In some cases, the virus would act as a mutual symbiont by increasing the virulence (hypervirulence), as described for viruses infecting the parasitic protozoa Leishmania [84], Trichomonas [85], and Cryptosporidum [86]. In other cases, a deleterious effect of the virus on the parasite could suggest a decrease in virulence (hypovirulence) such as for Giardia [87]. Variation in the virulence and fecundity of Cryptosporidium parvum isolates were reported by studies on cryptosporidiosis [86] suggesting that CsPV could have a role in the fecundity and possibly virulence of C. parvum by increasing the oocysts' production.

Conclusions
This study confirmed the presence of viruses associated with useful technological characteristics of yeast strains already employed for wine production and unveiled new viral species never reported in unicellular fungi before. The presence of complex mixed infections would imply unpredictable biological effects for the host, possibly leading to the alteration of the technological characteristics of interest, particularly in the case of commercial strains. Multiple infections may induce intra-host interactions among different viruses resulting in both synergistic and antagonistic effects as described for the interactions among plant [88] and fungal [89] viruses. These aspects portend the necessity of a periodic quality control to verify the stability of viral infections within the strain of interest.
Our relatively cheap technological approach to investigating the virome associated with yeast could be employed on larger scale studies that include the natural collection of isolates from various unexplored ecological niches. Applications of high-throughput molecular techniques coupled with protocols for transferring the infection to healthy strains will unravel the effect of viral infections on yeast biology as well as on their potential technological properties. Furthermore, given the natural yeast infection by the mitovirus and a partiti-like virus, in the future we will investigate their possible ability to infect S. cerevisiae and therefore establish a new virus-host model system to study mitovirus and partitivirus replication.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/v14010052/s1, Figure S1: Phylogenetic analysis of ScV-LA viral isolates; Figure S2: Alignment of the K2 preprotoxins from different M dsRNAs; Figure S3: Phylogenetic analysis of ScV-L-BC viral isolates; Table S1: Specific primers designed in this study; Table S2: Sequences of isolates retrieved from the GenBank; Table S3: Nucleotide percent identity and amino acid percent identity of ScV-LA isolates for Coat protein gene; Table S4: Nucleotide percent identity and amino acid percent identity of ScV-M-2 isolates for K-2 killer toxin; Table S5: Nucleotide percent identity and amino acid percent identity of ScV-L-BC isolates for partial RdRp and CP; Table S6: Amino acid percent identities of SbTV1 for Coat Protein gene; Table S7: Amino acid percent identities of SbTV1 for RNA dependent RNA polymerase gene; Table S8: Amino acid percent identities of ScCV1 isolates for RNA dependent RNA polymerase; Table S9: Amino acid percent identities of ScCV1 isolates for Coat protein; Table S10: Amino acid identities of ScV-NV-NDA21 for RNA dependent RNA polymerase; Supplementary online Text S1: ORFan FASTA sequences.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.